uplift model

Augmenting Limited and Biased RCTs through Pseudo-Sample Matching-Based Observational Data Fusion Method

Han, Kairong, Huang, Weidong, Zhou, Taiyang, Zhen, Peng, Kuang, Kun

arXiv.org Machine Learning

In the online ride-hailing pricing context, companies often conduct randomized controlled trials (RCTs) and utilize uplift models to assess the effect of discounts on customer orders, which substantially influences competitive market outcomes. However, due to the high cost of RCTs, the proportion of trial data relative to observational data is small, accounting for only 0.65% of total traffic in our context, which results in significant bias when generalizing to the broader user base. Additionally, the complexity of industrial processes reduces the quality of RCT data, which is often subject to heterogeneity from potential interference and selection bias that is difficult to correct. Moreover, existing data fusion methods are challenging to implement effectively in complex industrial settings due to the high dimensionality of features and strict assumptions that are hard to verify with real-world data. To address these issues, we propose an empirical data fusion method called pseudo-sample matching. By generating pseudo-samples from biased, low-quality RCT data and matching them with the most similar samples from large-scale observational data, the method expands the RCT dataset while mitigating its heterogeneity. We validated the method through simulation experiments and through offline and online tests using real-world data. In a week-long online experiment, we achieved a 0.41% improvement in profit, a considerable gain when scaled to industrial scenarios with hundreds of millions in revenue. In addition, we discuss the harm that low-quality RCT data does to model training, offline evaluation, and online economic benefits, and emphasize the importance of improving RCT data quality in industrial scenarios. Further details of the simulation experiments can be found in the GitHub repository https://github.com/Kairong-Han/Pseudo-Matching.
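The matching step can be sketched as a nearest-neighbor search over the observational pool. This is a minimal sketch of the idea only: the pseudo-sample generation rule (Gaussian jitter) and the Euclidean distance are illustrative assumptions, not the paper's exact procedure.

```python
import math
import random

random.seed(0)

def nearest_match(pseudo_x, pool):
    """Return the observational sample whose features are closest (Euclidean)."""
    def dist(a, b):
        return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))
    return min(pool, key=lambda obs: dist(obs["x"], pseudo_x))

# Toy data: a small, biased RCT feature set and a large observational pool.
rct_features = [[random.gauss(1.0, 0.3), random.gauss(0.5, 0.3)] for _ in range(20)]
obs_pool = [{"x": [random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)],
             "y": random.random()} for _ in range(2000)]

# Step 1: generate pseudo-samples by jittering RCT features (an illustrative
# generation rule standing in for the paper's pseudo-sample construction).
pseudo = [[f + random.gauss(0.0, 0.5) for f in x] for x in rct_features]

# Step 2: match each pseudo-sample to its most similar observational sample;
# the matched (features, outcome) pairs then augment the RCT training data.
augmented = [nearest_match(p, obs_pool) for p in pseudo]
print(len(augmented))  # → 20
```

With real high-dimensional features, a KD-tree or approximate nearest-neighbor index would replace the brute-force scan.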


An Example Safety Case for Safeguards Against Misuse

Clymer, Joshua, Weinbaum, Jonah, Kirk, Robert, Mai, Kimberly, Zhang, Selena, Davies, Xander

arXiv.org Artificial Intelligence

Existing evaluations of AI misuse safeguards provide a patchwork of evidence that is often difficult to connect to real-world decisions. To bridge this gap, we describe an end-to-end argument (a "safety case") that misuse safeguards reduce the risk posed by an AI assistant to low levels. We first describe how a hypothetical developer red teams safeguards, estimating the effort required to evade them. Then, the developer plugs this estimate into a quantitative "uplift model" to determine how much the barriers introduced by safeguards dissuade misuse (https://www.aimisusemodel.com/). This procedure provides a continuous signal of risk during deployment that helps the developer rapidly respond to emerging threats. Finally, we describe how to tie these components together into a simple safety case. Our work provides one concrete path -- though not the only path -- to rigorously justifying that AI misuse risks are low.
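The dissuasion logic can be sketched with a toy model: assume attacker effort budgets follow some distribution, and safeguards deter any attacker whose budget falls below the red-teamed evasion effort. The lognormal budget distribution and the 40-hour estimate below are invented for illustration and are not taken from the paper.

```python
import random

random.seed(1)

def fraction_deterred(required_effort, budgets):
    """Share of would-be misusers whose effort budget falls short of the
    red-team estimate of effort needed to evade the safeguards."""
    return sum(1 for b in budgets if b < required_effort) / len(budgets)

# Hypothetical attacker population: effort budgets (in hours) drawn from a
# lognormal distribution; distribution and parameters are illustrative.
budgets = [random.lognormvariate(2.0, 1.0) for _ in range(10_000)]

# Illustrative red-team estimate: ~40 hours of effort to evade safeguards.
deterred = fraction_deterred(40.0, budgets)
print(0.9 < deterred < 1.0)  # most attackers are dissuaded at this barrier
```

Re-running the red-team estimate as threats evolve updates the deterred fraction, which is the "continuous signal of risk" the safety case relies on.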


Improve ROI with Causal Learning and Conformal Prediction

Ai, Meng, Chen, Zhuo, Wang, Jibin, Shang, Jing, Tao, Tao, Li, Zhen

arXiv.org Artificial Intelligence

Abstract--In the commercial sphere, such as operations and maintenance, advertising, and marketing recommendations, intelligent decision-making utilizing data mining and neural network technologies is crucial, especially in resource allocation to optimize ROI. This study delves into the Cost-aware Binary Treatment Assignment Problem (C-BTAP) across different industries, with a focus on the state-of-the-art Direct ROI Prediction (DRP) method.

In a wide range of commercial activities, intelligent decision-making based on data mining and neural network technologies is playing an increasingly important role. One crucial aspect of this intelligent decision-making is figuring out how to allocate limited resources in order to maximize returns. For instance, in the field of operations and maintenance, how to allocate machine resources and computational power to maximize the revenue of supported businesses [1]; in the advertising sector, how to distribute an advertiser's total budget reasonably to maximize the revenue from their products [2]; and in the realms of recommendation and marketing, how to allocate suitable coupons, discounts, and coins as incentives to users in order to maximize platform user retention, GMV, etc. [3]-[8]. In causal inference, actions such as adjusting the computational power for a specific business operation, modulating the cost of a particular advertisement, and offering incentives of varying value, as mentioned in the above examples, are all treatments.

Three popular methods have been proposed for tackling C-BTAP: 1) the Two-Phase Method (TPM). TPM first utilizes uplift models, such as meta-learners [11], [12], causal forests [6], [13]-[15], or neural-network-based representation learning [16]-[18], to predict the revenue lift and cost lift, respectively; then a calculation is performed by dividing the revenue uplift prediction by the cost uplift prediction. However, the combination of a revenue uplift model and a cost uplift model may enlarge model errors due to the mathematical operations during combination. 2) The Direct Rank (DR) method creates a loss function aimed at ranking individuals' ROI, as noted in [9]. However, [5] demonstrate that accurate ranking is not achievable when the loss function fully converges because the loss function is not convex, as also detailed in Appendix E of [5]. 3) Based on our survey of the published literature, the Direct ROI Prediction (DRP) method [5], presented at AAAI 2023, remains the state-of-the-art (SOTA) for C-BTAP so far. DRP designs a convex loss function for neural networks to guarantee an unbiased estimation of individual ROI when the loss converges.
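The Two-Phase Method's ratio-based ranking, whose compounding errors motivate DRP, can be sketched in a few lines (all uplift values are invented):

```python
# Toy predicted uplifts for five users: revenue lift and cost lift from two
# separately trained uplift models.
revenue_uplift = [4.0, 2.0, 6.0, 1.0, 3.0]
cost_uplift = [2.0, 2.0, 1.5, 2.0, 1.0]

# Two-Phase Method: ROI score = revenue uplift prediction / cost uplift
# prediction; errors in either model compound through this division, which
# is the weakness a direct, convex ROI loss (as in DRP) is designed to avoid.
roi = [r / c for r, c in zip(revenue_uplift, cost_uplift)]
ranking = sorted(range(len(roi)), key=lambda i: roi[i], reverse=True)
print(ranking)  # → [2, 4, 0, 1, 3]
```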


Rankability-enhanced Revenue Uplift Modeling Framework for Online Marketing

He, Bowei, Weng, Yunpeng, Tang, Xing, Cui, Ziqiang, Sun, Zexu, Chen, Liang, He, Xiuqiang, Ma, Chen

arXiv.org Artificial Intelligence

Uplift modeling has been widely employed in online marketing by predicting the response difference between the treatment and control groups, so as to identify the individuals sensitive to interventions like coupons or discounts. Compared with traditional conversion uplift modeling, revenue uplift modeling exhibits higher potential due to its direct connection with corporate income. However, previous works can hardly handle the continuous long-tail response distribution in revenue uplift modeling. Moreover, they have neglected to optimize the uplift ranking among different individuals, which is actually the core of uplift modeling. To address these issues, in this paper, we first utilize the zero-inflated lognormal (ZILN) loss to regress the responses and customize the corresponding modeling network, which can be adapted to different existing uplift models. Then, we study the ranking-related uplift modeling error from the theoretical perspective and propose two tighter error bounds as additional loss terms to the conventional response regression loss. Finally, we directly model the uplift ranking error for the entire population with a listwise uplift ranking loss. The experiment results on offline public and industrial datasets validate the effectiveness of our method for revenue uplift modeling. Furthermore, we conduct large-scale experiments on a prominent online fintech marketing platform, Tencent FiT, which further demonstrates the superiority of our method in real-world applications.
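The ZILN negative log-likelihood mentioned above has a standard closed form: a Bernoulli term for whether the response is zero, plus a lognormal term for the magnitude of a positive response. A minimal sketch follows, with illustrative network outputs (p, mu, sigma) rather than anything from the paper:

```python
import math

def ziln_nll(y, p, mu, sigma):
    """Negative log-likelihood of the zero-inflated lognormal (ZILN) model:
    the response is 0 with probability (1 - p), and otherwise is drawn from
    LogNormal(mu, sigma)."""
    if y == 0:
        return -math.log(1.0 - p)
    lognormal_nll = (math.log(y * sigma * math.sqrt(2.0 * math.pi))
                     + (math.log(y) - mu) ** 2 / (2.0 * sigma ** 2))
    return -math.log(p) + lognormal_nll

# Long-tailed toy responses: many zeros, a few large spenders.
batch = [0.0, 0.0, 12.0, 0.0, 250.0]
p, mu, sigma = 0.4, 3.0, 1.5  # illustrative per-sample network outputs
loss = sum(ziln_nll(y, p, mu, sigma) for y in batch) / len(batch)
print(round(loss, 3))  # → 3.098
```

In a trained network, p, mu, and sigma would be produced per individual by the model head rather than held fixed.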


Fairness Evaluation for Uplift Modeling in the Absence of Ground Truth

Kadioglu, Serdar, Michalsky, Filip

arXiv.org Artificial Intelligence

The acceleration in the adoption of AI-based automated decision-making systems poses a challenge for evaluating the fairness of algorithmic decisions, especially in the absence of ground truth. When designing interventions, uplift modeling is used extensively to identify candidates that are likely to benefit from treatment. However, these models remain particularly difficult to evaluate for fairness because ground truth on the outcome measure is missing: a candidate cannot be in both treatment and control simultaneously. In this article, we propose a framework that overcomes the missing ground truth problem by generating surrogates to serve as a proxy for counterfactual labels of uplift modeling campaigns. We then leverage the surrogate ground truth to conduct a more comprehensive binary fairness evaluation. We show how to apply the approach in a comprehensive study of a real-world marketing campaign for promotional offers and demonstrate how it enhances fairness evaluation.
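A deliberately crude version of the surrogate idea can illustrate the mechanics: proxy each treated candidate's missing "no-treatment" outcome by the control-group rate of their protected group, then compare surrogate uplift across groups. The paper's learned surrogates are more sophisticated; everything below (data, proxy rule, threshold) is an assumption for illustration.

```python
import random
from statistics import mean

random.seed(2)

# Toy campaign: protected group, treatment flag, observed binary outcome.
candidates = [{"group": random.choice("AB"),
               "treated": random.random() < 0.5,
               "y": random.random() < 0.3}
              for _ in range(5000)]

# Surrogate counterfactual: proxy each treated candidate's "outcome without
# treatment" by the control conversion rate of their protected group.
control_rate = {g: mean(c["y"] for c in candidates
                        if c["group"] == g and not c["treated"])
                for g in "AB"}

# Surrogate uplift label: observed outcome minus proxied counterfactual.
surrogate_uplift = {g: mean(c["y"] - control_rate[g] for c in candidates
                            if c["group"] == g and c["treated"])
                    for g in "AB"}

# Fairness check: compare surrogate uplift across protected groups; no
# treatment effect was simulated here, so the gap should be near zero.
gap = abs(surrogate_uplift["A"] - surrogate_uplift["B"])
print(round(gap, 3))
```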


Robustness-enhanced Uplift Modeling with Adversarial Feature Desensitization

Sun, Zexu, He, Bowei, Ma, Ming, Tang, Jiakai, Wang, Yuchen, Ma, Chen, Liu, Dugang

arXiv.org Artificial Intelligence

Uplift modeling has shown very promising results in online marketing. However, most existing works are prone to robustness challenges in practical applications. In this paper, we first present a possible explanation for this phenomenon. Using different real-world datasets, we verify that there is a feature sensitivity problem in online marketing: perturbing some key features can seriously degrade the performance of the uplift model and even reverse its trend. To solve this problem, we propose a novel robustness-enhanced uplift modeling framework with adversarial feature desensitization (RUAD). Specifically, RUAD alleviates the feature sensitivity of the uplift model through two customized modules: a feature selection module with joint multi-label modeling that identifies a key subset of the input features, and an adversarial feature desensitization module that uses adversarial training and soft interpolation operations to enhance the model's robustness against this selected subset of features. Finally, we conduct extensive experiments on a public dataset and a real product dataset to verify the effectiveness of RUAD in online marketing. We also demonstrate the robustness of RUAD to feature sensitivity, as well as its compatibility with different uplift models.
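The feature sensitivity problem can be illustrated with a toy uplift scorer that leans heavily on one feature; a small perturbation probe exposes the dominant feature that a desensitization module like RUAD's would target. The model and numbers here are invented for illustration.

```python
def uplift_score(x):
    """Toy uplift model that leans heavily on feature 0 (a 'sensitive' feature)."""
    return 5.0 * x[0] + 0.2 * x[1] + 0.1 * x[2]

def sensitivity(model, x, eps=0.01):
    """Per-feature sensitivity: absolute score change from a small perturbation."""
    base = model(x)
    return [abs(model(x[:i] + [x[i] + eps] + x[i + 1:]) - base)
            for i in range(len(x))]

x = [1.0, 1.0, 1.0]
print([round(s, 3) for s in sensitivity(uplift_score, x)])  # → [0.05, 0.002, 0.001]
```

Feature 0 dominates the probe; adversarial training against perturbations of exactly such features is what the desensitization module aims at.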


The impact of heteroskedasticity on uplift modeling

Bokelmann, Björn, Lessmann, Stefan

arXiv.org Machine Learning

There are various applications where companies need to decide to which individuals they should best allocate treatment. To support such decisions, uplift models are applied to predict treatment effects on an individual level. Based on the predicted treatment effects, individuals can be ranked and treatment allocation can be prioritized according to this ranking. An implicit assumption, which has not been questioned in the previous uplift modeling literature, is that this treatment prioritization approach tends to bring individuals with high treatment effects to the top and individuals with low treatment effects to the bottom of the ranking. In our research, we show that heteroskedasticity in the training data can bias the uplift model ranking: individuals with the highest treatment effects can accumulate in large numbers at the bottom of the ranking. We explain theoretically how heteroskedasticity can bias the ranking of uplift models and demonstrate this process in a simulation and on real-world data. We argue that this ranking bias due to heteroskedasticity may occur in many real-world applications and requires modification of the treatment prioritization to achieve an efficient treatment allocation.
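The ranking bias can be reproduced in a few lines of simulation: give individuals with larger true effects proportionally noisier uplift estimates, and the bottom of the estimate-based ranking fills with high-effect individuals. The noise model below is an illustrative assumption, not the paper's setup.

```python
import random

random.seed(3)

# Individuals with larger true effects also have noisier outcomes
# (heteroskedasticity): the uplift estimate is the true effect plus noise
# whose standard deviation grows with the effect.
population = []
for _ in range(10_000):
    true_effect = random.random()  # true uplift in [0, 1]
    estimate = true_effect + random.gauss(0.0, 3.0 * true_effect)
    population.append((estimate, true_effect))

# Rank by the noisy estimate and inspect the bottom decile of the ranking.
population.sort(key=lambda pair: pair[0])
bottom_decile = population[:1000]
high_effect_at_bottom = sum(1 for _, e in bottom_decile if e > 0.5)
print(high_effect_at_bottom)  # well over half of the bottom decile
```

Under homoskedastic noise the bottom decile would instead be dominated by genuinely low-effect individuals.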


Improving uplift model evaluation on RCT data

Bokelmann, Björn, Lessmann, Stefan

arXiv.org Machine Learning

Estimating treatment effects is one of the most challenging and important tasks of data analysts. In many applications, like online marketing and personalized medicine, treatment needs to be allocated to the individuals where it yields a high positive treatment effect. Uplift models help select the right individuals for treatment and maximize the overall treatment effect (uplift). A major challenge in uplift modeling concerns model evaluation. Previous literature suggests methods like the Qini curve and the transformed outcome mean squared error. However, these metrics suffer from variance: their evaluations are strongly affected by random noise in the data, which renders their signals, to a certain degree, arbitrary. We theoretically analyze the variance of uplift evaluation metrics and derive possible methods of variance reduction, which are based on statistical adjustment of the outcome. We derive simple conditions under which the variance reduction methods improve the uplift evaluation metrics and empirically demonstrate their benefits on simulated and real-world data. Our paper provides strong evidence in favor of applying the suggested variance reduction procedures by default when evaluating uplift models on RCT data.
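The adjustment idea can be sketched with the transformed outcome Z = Y*T/p - Y*(1-T)/(1-p), whose expectation equals the average treatment effect: subtracting a baseline model m(x) from Y leaves that expectation unchanged but shrinks the variance. In this sketch the baseline m(x) = 5x is assumed known; in practice it would be fitted.

```python
import random
from statistics import mean, pstdev

random.seed(4)

def transformed_outcome(y, t, p=0.5):
    """Single-sample uplift signal Z = Y*T/p - Y*(1-T)/(1-p); E[Z] equals
    the average treatment effect when p is the treatment propensity."""
    return y * t / p - y * (1 - t) / (1 - p)

# Toy RCT: outcome = strong baseline driven by x, plus an effect of 1 if treated.
data = []
for _ in range(20_000):
    x = random.gauss(0.0, 1.0)
    t = random.randint(0, 1)
    y = 5.0 * x + 1.0 * t + random.gauss(0.0, 1.0)
    data.append((x, t, y))

# Plain transformed outcome vs. the statistically adjusted version that
# subtracts the baseline m(x) from Y first.
z_plain = [transformed_outcome(y, t) for x, t, y in data]
z_adjusted = [transformed_outcome(y - 5.0 * x, t) for x, t, y in data]

print(round(mean(z_plain)), round(mean(z_adjusted)))  # → 1 1 (both unbiased)
print(pstdev(z_adjusted) < pstdev(z_plain))           # → True (less variance)
```

The same adjusted outcomes can then feed a Qini curve or transformed outcome MSE with a much less noisy evaluation signal.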


Direct Heterogeneous Causal Learning for Resource Allocation Problems in Marketing

Zhou, Hao, Li, Shaoming, Jiang, Guibin, Zheng, Jiaqi, Wang, Dong

arXiv.org Artificial Intelligence

Marketing is an important mechanism to increase user engagement and improve platform revenue, and heterogeneous causal learning can help develop more effective strategies. Most decision-making problems in marketing can be formulated as resource allocation problems and have been studied for decades. Existing works usually divide the solution procedure into two fully decoupled stages, i.e., machine learning (ML) and operations research (OR): the first stage predicts the model parameters, which are then fed to the optimization in the second stage. However, the error of the parameters predicted in ML is not respected in OR, and the series of complex mathematical operations in OR increases the accumulated error. Essentially, improved precision of the predicted parameters may not correlate positively with the quality of the final solution, a side effect of the decoupled design. In this paper, we propose a novel approach for solving resource allocation problems that mitigates these side effects. Our key intuition is to introduce a decision factor that establishes a bridge between ML and OR, such that the solution can be obtained directly in OR by performing only sorting or comparison operations on the decision factor. Furthermore, we design a customized loss function that conducts direct heterogeneous causal learning on the decision factor, an unbiased estimation of which is guaranteed when the loss converges. As a case study, we apply our approach to two crucial problems in marketing: the binary treatment assignment problem and the budget allocation problem with multiple treatments. Both large-scale simulations and online A/B tests demonstrate that our approach achieves significant improvements over the state-of-the-art.
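The "sorting on the decision factor" reduction can be sketched for the budgeted binary treatment assignment case. All scores, costs, and the budget below are invented for illustration; the paper's decision factor is learned, not hand-set.

```python
# Toy binary treatment assignment under a budget: each user carries a
# predicted decision factor (an ROI-like score learned end-to-end) and a cost.
users = [
    {"id": 0, "score": 3.0, "cost": 2.0},
    {"id": 1, "score": 1.0, "cost": 1.0},
    {"id": 2, "score": 4.0, "cost": 3.0},
    {"id": 3, "score": 0.5, "cost": 1.0},
]
budget = 5.0

# The OR stage reduces to sorting by the decision factor and greedily
# treating users until the budget is exhausted.
treated = []
remaining = budget
for u in sorted(users, key=lambda u: u["score"], reverse=True):
    if u["cost"] <= remaining:
        treated.append(u["id"])
        remaining -= u["cost"]

print(treated)  # → [2, 0]
```

Because the solution depends only on the ordering of the decision factor, prediction errors that preserve the ordering do not change the allocation, which is the point of learning the factor directly.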


Partial counterfactual identification and uplift modeling: theoretical results and real-world assessment

Verhelst, Théo, Mercier, Denis, Shrestha, Jeevan, Bontempi, Gianluca

arXiv.org Artificial Intelligence

An example of a counterfactual statement is "I got no effect since I took no action, but something would have happened had I acted". Counterfactuals are used in many fields, ranging from algorithmic recourse [Karimi et al., 2021] to online advertisement and customer relationship management [Li and Pearl, 2019]. Counterfactuals have been formally defined in terms of structural causal models by Pearl [2009]. Nevertheless, since a counterfactual statement cannot be directly observed, research focuses on estimating or bounding its probability (e.g. the probability that we have an effect given a treatment and no effect otherwise). The probabilities of some specific counterfactual expressions have been studied in the literature [Tian and Pearl, 2000] because of their relevance in causal decision-making. The probability of necessity (PN) is the probability that an event y would not have occurred in the absence of an action or treatment t, given that y and t in fact occurred. Conversely, the probability of sufficiency (PS) is the probability that event y would have occurred in the presence of an action t, given that both y and t in fact did not occur. Lastly, the probability of necessity and sufficiency (PNS) is the probability that the event y occurs if and only if the event t occurs. In the case of incomplete knowledge about the causal model, identification procedures indicate when and how the probability of counterfactuals can be computed from a combination of observational data, experimental data (i.e.
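For intuition, the Tian-Pearl bounds on PNS that are computable from experimental data alone are max(0, P(y|do(t)) - P(y|do(t'))) <= PNS <= min(P(y|do(t)), 1 - P(y|do(t'))); the numerical inputs below are invented RCT estimates.

```python
def pns_bounds(p_y_do_t, p_y_do_not_t):
    """Tian-Pearl bounds on the probability of necessity and sufficiency
    (PNS) from the experimental quantities P(y | do(t)) and P(y | do(t'))."""
    lower = max(0.0, p_y_do_t - p_y_do_not_t)
    upper = min(p_y_do_t, 1.0 - p_y_do_not_t)
    return lower, upper

# Illustrative RCT estimates: 75% respond under treatment, 25% without it.
print(pns_bounds(0.75, 0.25))  # → (0.5, 0.75)
```

Combining experimental with observational data narrows these bounds further, which is the partial identification setting the paper studies.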